feat: implement a core tpch table func to generate all data#1

Merged
clflushopt merged 7 commits into main from cl/feat/tpch-table-func-global on Jun 6, 2025
@clflushopt
Member

This change introduces a new table function, SELECT * FROM tpch(scale_factor, write_to_disk, path), that generates all the individual tables in one go. This allows us to register a single UDTF instead of multiple ones.

Contributor

@kevinjqliu kevinjqliu left a comment

SELECT * FROM tpch(scale_factor, write_to_disk, path)

I think write_to_disk here can be derived from path. As a user, I would only specify path if I want to write to disk.
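A minimal sketch of that idea, using only the standard library. The `TpchArgs` struct, its field names, and the rule that an empty path counts as "no path" are assumptions made for illustration; they are not taken from the PR's actual code.

```rust
use std::path::PathBuf;

/// Hypothetical arguments for the `tpch` table function. The names here
/// are illustrative only and do not come from the PR.
#[derive(Debug, PartialEq)]
pub struct TpchArgs {
    pub scale_factor: f64,
    /// `None` means the generated tables stay in memory only.
    pub output_path: Option<PathBuf>,
}

impl TpchArgs {
    /// Build the arguments from a scale factor and an optional path.
    /// Assumption: an empty or whitespace-only path is treated as "no path".
    pub fn new(scale_factor: f64, path: Option<&str>) -> Self {
        let output_path = path
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .map(PathBuf::from);
        Self { scale_factor, output_path }
    }

    /// Writing to disk is implied by the presence of a path, so no
    /// separate boolean flag is needed.
    pub fn write_to_disk(&self) -> bool {
        self.output_path.is_some()
    }
}
```

With this shape, a call that passes only a scale factor never touches the disk, and there is no way to pass a contradictory combination such as write_to_disk = true without a path.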

Another option for the "write to disk" feature might be to use the COPY command.
This allows us to specify the path (location) and other write options, such as the Parquet options, and it aligns with the DuckDB solution described here.
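As a rough sketch of the COPY route, the helper below builds one DataFusion-style `COPY ... TO ... STORED AS PARQUET` statement per TPC-H table. It assumes the tables are registered under their standard TPC-H names (the example in this PR queries `nation`); the function names and the one-file-per-table layout are hypothetical.

```rust
/// The eight standard TPC-H table names.
const TPCH_TABLES: [&str; 8] = [
    "nation", "region", "part", "supplier",
    "partsupp", "customer", "orders", "lineitem",
];

/// Build a COPY statement that writes `table` to `<out_dir>/<table>.parquet`.
/// Note: this sketch does not escape single quotes in `out_dir`.
fn copy_statement(table: &str, out_dir: &str) -> String {
    // Drop trailing slashes so the joined path has exactly one separator.
    let dir = out_dir.trim_end_matches('/');
    format!("COPY {table} TO '{dir}/{table}.parquet' STORED AS PARQUET")
}

/// Build one COPY statement per TPC-H table.
fn copy_statements(out_dir: &str) -> Vec<String> {
    TPCH_TABLES
        .iter()
        .map(|&table| copy_statement(table, out_dir))
        .collect()
}
```

Each resulting string could then be passed to `ctx.sql(...)` in turn, which would keep the table function itself free of any write-related arguments.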

Side note: I would love the ability to write multiple Parquet files based on size (e.g. 512MB). DuckDB has the FILE_SIZE_BYTES option in the COPY command, but I could not find a similar option in DataFusion.

Comment thread src/lib.rs Outdated
Comment thread src/lib.rs Outdated
@clflushopt
Member Author

As of now I am happy with the feature set. I'll wait for another pair of eyes before merging and tagging a new release. Thanks @kevinjqliu

@clflushopt clflushopt merged commit 6e11566 into main Jun 6, 2025
1 check passed
Comment thread examples/tpchgen.rs
let sql_df = ctx.sql(&format!("SHOW TABLES;")).await?;
sql_df.show().await?;

let sql_df = ctx.sql(&format!("SELECT * FROM nation LIMIT 5;")).await?;
this is really cool

Labels

enhancement New feature or request

3 participants